Clustering to identify RNA conformations constrained by secondary structure.
نویسندگان
چکیده
RNA often folds hierarchically, so that its sequence defines its secondary structure (helical base-paired regions connected by single-stranded junctions), which subsequently defines its tertiary fold. To preserve base-pairing and chain connectivity, the three-dimensional conformations that RNA can explore are strongly confined compared to when secondary structure constraints are not enforced. Using three examples, we studied how secondary structure confines and dictates an RNA's preferred conformations. We made use of Macromolecular Conformations by SYMbolic programming (MC-Sym) fragment assembly to generate RNA conformations constrained by secondary structure. Then, to understand the correlations between different helix placements and orientations, we robustly clustered all RNA conformations by employing unique methods to remove outliers and estimate the best number of conformational clusters. We observed that the preferred conformation (as judged by largest cluster size) for each type of RNA junction molecule tested is consistent with its biological function. Further, the improved quality of models in our pruned datasets facilitates subsequent discrimination using scoring functions based either on statistical analysis (knowledge based) or experimental data.
منابع مشابه
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
متن کاملA Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
متن کاملRelation Between RNA Sequences, Structures, and Shapes via Variation Networks
Background: RNA plays key role in many aspects of biological processes and its tertiary structure is critical for its biological function. RNA secondary structure represents various significant portions of RNA tertiary structure. Since the biological function of RNA is concluded indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...
متن کاملRNA secondary structure and qRT-PCR analyses pertained to expressed anti-CD25 CAR in NK-92 cell line
Background and Objectives: Tumor-infiltrating regulatory T (TI-Treg) cells perform the significant function in cancer immune escape. In this study, the third generation CAR construct was designed against human CD25 antigen, the significant cell surface biomarker of TI-Tregs. Methods: Initially, the construct of anti-CD25 CAR was designed. Using RNAfold web server, the RNA secondary structure wa...
متن کاملRepeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Proceedings of the National Academy of Sciences of the United States of America
دوره 108 9 شماره
صفحات -
تاریخ انتشار 2011